Add Comprehensive Circuit Breaker User Guide for KMesh Kernel-Native Implementation #110
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
✅ Deploy Preview for kmesh-net ready!
6448517 to db6f23f
@@ -0,0 +1,253 @@
---
draft: false
linktitle: Circuit Breaker
Please add a Chinese guide too.
@DeshDeepakKant is not from China; he does not speak Chinese.
spec:
  containers:
  - name: service
    image: your-service-image
We should provide a runnable application; please refer to the other guides.
```bash
# Install hey load testing tool
go install github.com/rakyll/hey@latest
OK. You can use fortio (https://github.com/kmesh-net/kmesh/tree/main/samples/fortio) as well.
go install github.com/rakyll/hey@latest

# Generate load
hey -n 1000 -c 50 http://sample-service/endpoint
I am not sure how circuit breaking works here, since this is run from a binary. How is the client managed by kmesh?
```bash
# View KMesh circuit breaker logs
kubectl logs -n kmesh -l app=kmesh circuit-breaker
?
-n kmesh-system
@hzxuzhonghu Thank you for the review. I have made the changes and pushed them. The updates are running smoothly and have been verified on my end. If any further adjustments are needed, please let me know.
Additionally, I've attached some terminal logs for your reference. Could you please confirm if everything looks correct?
Can you confirm that you have verified what you are submitting yourself?
5295889 to 2545431
…uide
- Implement detailed documentation for Circuit Breaker feature
- Include technical implementation details
- Provide configuration examples and best practices
- Cover troubleshooting and monitoring aspects
Signed-off-by: Desh Deepak Kant <deshdeepakkant@gmail.com>
Closes kmesh-net#103
Signed-off-by: DeshDeepakKant <deshdeepakkant@gmail.com>
Signed-off-by: DeshDeepakKant <deshdeepakkant@gmail.com>
Signed-off-by: DeshDeepakKant <deshdeepakkant@gmail.com>
2545431 to 4bc5264
Keywords which can automatically close issues and at(@) or hashtag(#) mentions are not allowed in commit messages. The list of commits with invalid commit messages:
fortio load -c 2 -qps 20 -t 30s http://test-service
```

### Analyzing Results
Can you show the results as well?
Logs or metrics are fine
@LiZhenCheng9527 Thank you for your response. Could you kindly clarify what you mean by "results"? Are you referring to the terminal result logs to include in the user guide?
For your convenience, I have attached the terminal log below for reference. Please let me know if this aligns with what you were looking for or if there are any additional details or metrics you'd like me to provide.
desh@pop-os:~/kt$ # Heavy load to trigger circuit breaker
kubectl exec -it deploy/fortio -- \
fortio load -c 5 -qps 100 -t 30s http://test-service
# Verify circuit breaker status
kubectl get destinationrule test-circuit-breaker -o yaml
# Simulate service failure
kubectl scale deployment test-service --replicas=0
15:15:39.852 r1 [INF] scli.go:122> Starting, command="Φορτίο", version="1.66.5 h1:WTJzTGOA12YWZSM5g43602lH+GOsmP3eKHXLnuRW4vs= go1.22.7 amd64 linux", go-max-procs=12
Fortio 1.66.5 running at 100 queries per second, 12->12 procs, for 30s: http://test-service
15:15:39.853 r1 [INF] httprunner.go:121> Starting http test, run=0, url="http://test-service", threads=5, qps="100.0", warmup="parallel", conn-reuse=""
Starting at 100 qps with 5 thread(s) [gomax 12] for 30s : 600 calls each (total 3000)
15:16:09.858 r84 [INF] periodic.go:851> T003 ended after 30.000992951s : 600 calls. qps=19.99933805457598
15:16:09.858 r83 [INF] periodic.go:851> T002 ended after 30.000983273s : 600 calls. qps=19.99934450615098
15:16:09.858 r82 [INF] periodic.go:851> T001 ended after 30.001020274s : 600 calls. qps=19.99931984046497
15:16:09.858 r85 [INF] periodic.go:851> T004 ended after 30.001037727s : 600 calls. qps=19.99930820592978
15:16:09.858 r81 [INF] periodic.go:851> T000 ended after 30.001090238s : 600 calls. qps=19.99927320107946
Ended after 30.001156064s : 3000 calls. qps=99.996
15:16:09.858 r1 [INF] periodic.go:581> Run ended, run=0, elapsed=30001156064, calls=3000, qps=99.99614660182583
Sleep times : count 2995 avg 0.049068316 +/- 0.0003238 min 0.048044947 max 0.049896243 sum 146.959607
Aggregated Function Time : count 3000 avg 0.00034710756 +/- 9.911e-05 min 0.000126072 max 0.000813633 sum 1.04132267
# range, mid point, percentile, count
>= 0.000126072 <= 0.000813633 , 0.000469852 , 100.00, 3000
# target 50% 0.000469738
# target 75% 0.000641685
# target 90% 0.000744854
# target 99% 0.000806755
# target 99.9% 0.000812945
Error cases : no data
# Socket and IP used for each connection:
[0] 1 socket used, resolved to 10.96.230.153:80, connection timing : count 1 avg 0.000159577 +/- 0 min 0.000159577 max 0.000159577 sum 0.000159577
[1] 1 socket used, resolved to 10.96.230.153:80, connection timing : count 1 avg 0.000138426 +/- 0 min 0.000138426 max 0.000138426 sum 0.000138426
[2] 1 socket used, resolved to 10.96.230.153:80, connection timing : count 1 avg 0.000130691 +/- 0 min 0.000130691 max 0.000130691 sum 0.000130691
[3] 1 socket used, resolved to 10.96.230.153:80, connection timing : count 1 avg 0.000233499 +/- 0 min 0.000233499 max 0.000233499 sum 0.000233499
[4] 1 socket used, resolved to 10.96.230.153:80, connection timing : count 1 avg 0.000172922 +/- 0 min 0.000172922 max 0.000172922 sum 0.000172922
Connection time histogram (s) : count 5 avg 0.000167023 +/- 3.646e-05 min 0.000130691 max 0.000233499 sum 0.000835115
# range, mid point, percentile, count
>= 0.000130691 <= 0.000233499 , 0.000182095 , 100.00, 5
# target 50% 0.000169244
# target 75% 0.000201372
# target 90% 0.000220648
# target 99% 0.000232214
# target 99.9% 0.00023337
Sockets used: 5 (for perfect keepalive, would be 5)
Uniform: false, Jitter: false, Catchup allowed: true
IP addresses distribution:
10.96.230.153:80: 5
Code 200 : 3000 (100.0 %)
Response Header Sizes : count 3000 avg 238 +/- 0 min 238 max 238 sum 714000
Response Body/Total Sizes : count 3000 avg 853 +/- 0 min 853 max 853 sum 2559000
All done 3000 calls (plus 5 warmup) 0.347 ms avg, 100.0 qps
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"networking.istio.io/v1alpha3","kind":"DestinationRule","metadata":{"annotations":{},"name":"test-circuit-breaker","namespace":"default"},"spec":{"host":"test-service","trafficPolicy":{"connectionPool":{"http":{"http1MaxPendingRequests":1,"maxRequestsPerConnection":1}},"outlierDetection":{"baseEjectionTime":"30s","consecutive5xxErrors":3,"interval":"5s"}}}}
creationTimestamp: "2025-01-23T13:22:14Z"
generation: 1
name: test-circuit-breaker
namespace: default
resourceVersion: "64439"
uid: 07ed1da0-79c7-45a8-81b8-a7912e6d1568
spec:
host: test-service
trafficPolicy:
connectionPool:
http:
http1MaxPendingRequests: 1
maxRequestsPerConnection: 1
outlierDetection:
baseEjectionTime: 30s
consecutive5xxErrors: 3
interval: 5s
deployment.apps/test-service scaled
desh@pop-os:~/kt$ # Restore service
kubectl scale deployment test-service --replicas=1
# Test recovery
kubectl exec -it deploy/fortio -- \
fortio load -c 2 -qps 20 -t 30s http://test-service
deployment.apps/test-service scaled
15:16:13.095 r1 [INF] scli.go:122> Starting, command="Φορτίο", version="1.66.5 h1:WTJzTGOA12YWZSM5g43602lH+GOsmP3eKHXLnuRW4vs= go1.22.7 amd64 linux", go-max-procs=12
Fortio 1.66.5 running at 20 queries per second, 12->12 procs, for 30s: http://test-service
15:16:13.095 r1 [INF] httprunner.go:121> Starting http test, run=0, url="http://test-service", threads=2, qps="20.0", warmup="parallel", conn-reuse=""
15:16:13.097 r52 [ERR] http_client.go:954> Unable to connect, dest={"IP":"10.96.230.153","Port":80,"Zone":""}, err="dial tcp 10.96.230.153:80: connect: connection refused", numfd=7, thread=1, run=0
15:16:13.097 r51 [ERR] http_client.go:954> Unable to connect, dest={"IP":"10.96.230.153","Port":80,"Zone":""}, err="dial tcp 10.96.230.153:80: connect: connection refused", numfd=6, thread=0, run=0
Aborting because of error -1 for http://test-service (0 bytes)
command terminated with exit code 1
desh@pop-os:~/kt$ # View test results
kubectl logs deploy/fortio
Found 2 pods, using pod/fortio-deploy-5669d4866b-rwlzj
{"ts":1737903211.076985,"level":"info","r":1,"file":"updater.go","line":50,"msg":"Configmap flag value watching on /etc/fortio"}
{"ts":1737903211.077518,"level":"crit","r":1,"file":"scli.go","line":83,"msg":"Unable to watch config/flag changes in /etc/fortio: dflag: error initializing fsnotify watcher"}
{"ts":1737903211.077641,"level":"info","r":1,"file":"scli.go","line":122,"msg":"Starting","command":"Φορτίο","version":"1.66.5 h1:WTJzTGOA12YWZSM5g43602lH+GOsmP3eKHXLnuRW4vs= go1.22.7 amd64 linux","go-max-procs":12}
{"ts":1737903211.079770,"level":"info","r":1,"msg":"Fortio 1.66.5 tcp-echo server listening on tcp [::]:8078"}
{"ts":1737903211.079867,"level":"info","r":1,"msg":"Fortio 1.66.5 udp-echo server listening on udp [::]:8078"}
{"ts":1737903211.079908,"level":"info","r":1,"msg":"Fortio 1.66.5 grpc 'ping' server listening on tcp [::]:8079"}
{"ts":1737903211.080657,"level":"info","r":1,"msg":"Fortio 1.66.5 https redirector server listening on tcp [::]:8081"}
{"ts":1737903211.082276,"level":"info","r":1,"msg":"Fortio 1.66.5 http-echo server listening on tcp [::]:8080"}
{"ts":1737903211.082363,"level":"info","r":1,"msg":"Data directory is /var/lib/fortio"}
{"ts":1737903211.082392,"level":"info","r":1,"msg":"REST API on /fortio/rest/run, /fortio/rest/status, /fortio/rest/stop, /fortio/rest/dns"}
UI started - visit:
http://localhost:8080/fortio/
(or any host/ip reachable on this server)
{"ts":1737903211.083216,"level":"info","r":1,"msg":"Debug endpoint on /debug, Additional Echo on /debug/echo/, Flags on /fortio/flags, and Metrics on /debug/metrics"}
{"ts":1737903211.083302,"level":"info","r":1,"file":"fortio_main.go","line":307,"msg":"All fortio 1.66.5 h1:WTJzTGOA12YWZSM5g43602lH+GOsmP3eKHXLnuRW4vs= go1.22.7 amd64 linux servers started!"}
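For reference, the limits that were in effect during the run above can be read back from the `last-applied-configuration` annotation in the `kubectl get destinationrule` output. This is only a minimal Python sketch: it assumes the JSON string is copied verbatim from that log, and the helper name `breaker_limits` is hypothetical (it is not part of Kmesh, Istio, or fortio).

```python
import json

# Annotation value copied from the `kubectl get destinationrule` output above.
LAST_APPLIED = (
    '{"apiVersion":"networking.istio.io/v1alpha3","kind":"DestinationRule",'
    '"metadata":{"annotations":{},"name":"test-circuit-breaker","namespace":"default"},'
    '"spec":{"host":"test-service","trafficPolicy":{'
    '"connectionPool":{"http":{"http1MaxPendingRequests":1,"maxRequestsPerConnection":1}},'
    '"outlierDetection":{"baseEjectionTime":"30s","consecutive5xxErrors":3,"interval":"5s"}}}}'
)


def breaker_limits(manifest_json):
    """Flatten the HTTP connection-pool and outlier-detection settings into one dict.

    Hypothetical helper for illustration only.
    """
    policy = json.loads(manifest_json)["spec"]["trafficPolicy"]
    limits = dict(policy.get("connectionPool", {}).get("http", {}))
    limits.update(policy.get("outlierDetection", {}))
    return limits


if __name__ == "__main__":
    # Print each configured limit on its own line, sorted by name.
    for key, value in sorted(breaker_limits(LAST_APPLIED).items()):
        print(f"{key}: {value}")
```

Listing the limits next to the fortio summary makes it easier to judge whether a run with 5 connections and 100% `Code 200` responses is what these settings should produce.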
If a circuit breaker is configured, then when multiple connections access the service at the same time, some of the requests will fail.
You can start by providing the results of a fortio run without a circuit breaker configured.
IP addresses distribution: 10.96.230.153:80: 5
Code 200 : 3000 (100.0 %)
Then provide the results of a fortio run with a circuit breaker configured.
IP addresses distribution: 10.96.230.153:80: 5
Code 200 : 1914 (63.8%)
Code 503 : 1086 (36.2%)
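The comparison requested above can be made mechanical by extracting the `Code NNN : count` lines from each fortio summary and computing the share of 503 responses. The following is a rough Python sketch, not part of the guide under review: the function names are hypothetical, the "without" sample is taken from the log posted earlier, and the "with" sample uses the illustrative figures from this comment rather than a measured run.

```python
import re

# Matches fortio summary lines such as "Code 503 : 1086 (36.2%)".
CODE_LINE = re.compile(r"^Code\s+(\d+)\s*:\s*(\d+)")

# Summary line from the run posted earlier in this thread (no 503s observed).
WITHOUT_BREAKER = "Code 200 : 3000 (100.0 %)"

# Illustrative figures from the review comment, not a measured result.
WITH_BREAKER = "Code 200 : 1914 (63.8%)\nCode 503 : 1086 (36.2%)"


def status_counts(fortio_output):
    """Return {status_code: request_count} parsed from fortio summary text."""
    counts = {}
    for line in fortio_output.splitlines():
        match = CODE_LINE.match(line.strip())
        if match:
            counts[int(match.group(1))] = int(match.group(2))
    return counts


def rejected_share(counts, code=503):
    """Fraction of all requests that returned the given status code."""
    total = sum(counts.values())
    return counts.get(code, 0) / total if total else 0.0


if __name__ == "__main__":
    for label, output in (("without", WITHOUT_BREAKER), ("with", WITH_BREAKER)):
        counts = status_counts(output)
        print(f"{label} circuit breaker: {counts}, 503 share {rejected_share(counts):.1%}")
```

A non-zero 503 share only in the run with the DestinationRule applied would be the kind of evidence the guide could show under "Analyzing Results".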
Description
During the Open Source Promotion Plan (OSPP), KMesh has successfully implemented a circuit breaker mechanism in Kernel-Native mode. However, the current documentation lacks a comprehensive user guide to help developers understand and utilize this feature effectively.
Objectives
Key Components